Skip to content

Add Spark interop test for reading Puffin deletion vectors#3476

Open
moomindani wants to merge 3 commits into
apache:mainfrom
moomindani:moomindani/puffin-dv-reader-interop-test
Open

Add Spark interop test for reading Puffin deletion vectors#3476
moomindani wants to merge 3 commits into
apache:mainfrom
moomindani:moomindani/puffin-dv-reader-interop-test

Conversation

@moomindani

Copy link
Copy Markdown

Rationale for this change

Extracted from #3474 per review feedback (#3474 (comment)): this integration test verifies that PyIceberg can read Puffin deletion vectors written by Spark, which holds independently of the PuffinWriter changes in that PR.

Are these changes tested?

This PR is test-only. The test passed CI as part of #3474 (integration-test job) before being extracted.

Are there any user-facing changes?

No.

@ebyhr ebyhr left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me except for comments.

Comment thread tests/integration/test_puffin_spark_interop.py Outdated
Comment thread tests/integration/test_puffin_spark_interop.py Outdated
Comment thread tests/integration/test_puffin_spark_interop.py Outdated
- Move test_read_spark_written_puffin_dv from test_puffin_spark_interop.py
  to test_deletes.py; delete the old file
- Replace run_spark_commands helper calls with direct spark.sql() calls
- Tighten entries assertion from > 0 to == 1

@rambleraptor rambleraptor left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Thanks for doing this!

@moomindani

Copy link
Copy Markdown
Author

Hi @ebyhr, thanks for the review! I've addressed all your comments (moved the test to test_deletes.py, replaced run_spark_commands with direct spark.sql() calls, and switched to the == 1 assertions). Could you take another look and merge if it looks good? CI is green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants